Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data
Authors
Abstract
Strict consistency of replicated data is infeasible or not required by many distributed applications, so current systems often permit stale replication, in which cached copies of data values are allowed to become out of date. Queries over cached data return an answer quickly, but the stale answer may be unboundedly imprecise. Alternatively, queries over remote master data return a precise answer, but with potentially poor performance. To bridge the gap between these two extremes, we propose a new class of replication systems called TRAPP (Tradeoff in Replication Precision and Performance). TRAPP systems give each user fine-grained control over the tradeoff between precision and performance: Caches store ranges that are guaranteed to bound the current data values, instead of storing stale exact values. Users supply a quantitative precision constraint along with each query. To answer a query, TRAPP systems automatically select a combination of locally cached bounds and exact master data stored remotely to deliver a bounded answer consisting of a range that is no wider than the specified precision constraint, that is guaranteed to contain the precise answer, and that is computed as quickly as possible. This paper defines the architecture of TRAPP replication systems and covers some mechanics of caching data ranges. It then focuses on queries with aggregation, presenting optimization algorithms for answering queries with precision constraints, and reporting on performance experiments that demonstrate the fine-grained control of the precision-performance tradeoff offered by TRAPP systems.
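The bounded-answer idea described in the abstract can be illustrated for a SUM query with a minimal sketch. It assumes each cached item is a range `[low, high]` guaranteed to contain the master value, and that the precision constraint is the maximum allowed width of the answer range. The names `CachedBound`, `bounded_sum`, `fetch_exact`, and `refresh_cost` are hypothetical, and the greedy choice of which items to refresh is a simple stand-in heuristic, not the optimization algorithm presented in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class CachedBound:
    """A cached range assumed to contain the current master value (hypothetical type)."""
    key: str
    low: float
    high: float
    refresh_cost: float = 1.0  # assumed positive cost of fetching the exact value

    @property
    def width(self) -> float:
        return self.high - self.low


def bounded_sum(
    bounds: Iterable[CachedBound],
    precision: float,
    fetch_exact: Callable[[str], float],
) -> Tuple[float, float, List[str]]:
    """Return (low, high, refreshed_keys) for SUM over all items.

    The answer range is at most `precision` wide. Items are refreshed
    greedily by width per unit cost (an illustrative heuristic only).
    """
    working = {b.key: b for b in bounds}
    refreshed: List[str] = []

    def total_width() -> float:
        return sum(b.width for b in working.values())

    # Consider the items that shrink the answer range most per unit cost first.
    order = sorted(working.values(), key=lambda b: b.width / b.refresh_cost, reverse=True)
    for b in order:
        if total_width() <= precision:
            break
        if b.width == 0:
            continue
        exact = fetch_exact(b.key)  # the slow remote read of master data
        if not (b.low <= exact <= b.high):
            raise ValueError(f"cached bound for {b.key!r} does not contain master value")
        working[b.key] = CachedBound(b.key, exact, exact, b.refresh_cost)
        refreshed.append(b.key)

    low = sum(b.low for b in working.values())
    high = sum(b.high for b in working.values())
    return low, high, refreshed


# Usage with made-up data: a dict stands in for the remote master copy.
master = {"a": 12.0, "b": 47.0, "c": 30.0}
cache = [
    CachedBound("a", 10.0, 15.0),
    CachedBound("b", 40.0, 60.0),
    CachedBound("c", 28.0, 31.0),
]
print(bounded_sum(cache, precision=10.0, fetch_exact=master.__getitem__))
```

With a loose precision constraint the query is answered entirely from the cache; tightening the constraint forces more remote reads, which is the precision-performance tradeoff the abstract describes.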
Similar resources
Compact Representations of Event Sequences
We introduce a new technique for the efficient management of large sequences of multidimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation queries. More importantly, our representation is flexible in the sense that the relevant dimensions and queries may be used to guide the construction process, easily providing a ...
Energy-Conscious Data Aggregation Over Large-Scale Sensor Networks
Recent advances in hardware technology facilitate applications requiring large numbers of sensor devices, where each sensor device has computational, storage, and communication capabilities. Since sensor devices are powered by ordinary batteries, power is a limiting resource in sensor networks. Power usage can be reduced by pushing part of the computation into the network to reduce communicatio...
Power-aware Query Processing over Sensor Networks
Recent advances in hardware technology make applications requiring large numbers of sensor devices possible, where each sensor device has computation, memory, and communication capabilities. Since sensor devices are powered by ordinary batteries, power is a limiting resource in sensor networks. Some work has been proposed to reduce the power usage by pushing part of the computation into the net...
External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages
With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques that converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique and existing methods suffer from lack of producing accurate queries, Precision and Speed of retrieved result...
SMART: Adaptive Precision Setting for Aggregation Queries over Distributed Data Streams
We present SMART, a load-aware, self-tuning algorithm for processing continuous aggregate queries in distributed data stream systems. SMART maximizes query result accuracy while keeping monitoring bandwidth below a specified budget despite potentially bursty data streams whose workload characteristics change over time. To accomplish this goal, SMART’s hierarchical algorithm computes for each no...